Abstract

We consider the problem of extracting entropy by sparse transformations, namely functions 
with a small number of overall input-output dependencies. In contrast to previous works, 
we seek extractors for essentially all the entropy without any assumption on the underlying 
distribution beyond a min-entropy requirement. We give two simple constructions of sparse 
extractor families, which are collections of sparse functions such that for any distribution X on 
inputs of sufficiently high min-entropy, the output of most functions from the collection on a 
random input chosen from X is statistically close to uniform. 

For strong extractor families (i.e., functions in the family do not take additional randomness) 
we give upper and lower bounds on the sparsity that are tight up to a constant factor for a wide 
range of min-entropies. We then prove that for some min-entropies weak extractor families can 
achieve better sparsity. 

We show how this construction can be used towards more efficient parallel transformation of
(non-uniform) one-way functions into pseudorandom generators. More generally, sparse extractor
families can be used instead of pairwise independence in various randomized or nonuniform
settings where preserving locality (i.e., parallelism) is of interest.
1 Introduction



Randomness extractors [NZ93, Sha04] have numerous applications in complexity theory and
cryptography. For example, in computational complexity they are used to isolate satisfying
assignments of boolean formulas [VV86] and in constructing pseudorandom generators for
space-bounded computation [Nis92, RR99]. One notable use in cryptography is in the construction
of pseudorandom generators from one-way functions [HILL99, HRV10, VZ11].

In many applications of extractors, including the above ones, it is important that the extractors 
recover essentially all the entropy of the input distribution. A popular choice in such scenarios is to 
instantiate the extractor by a pairwise-independent hash function family [BBR88, ILL89]. Pairwise- 
independent functions are appealing because they have a variety of implementations, ranging from 
very simple ones [CW79] to very efficient ones [IKOS08]. 

Mansour, Nisan, and Tiwari [MNT93] observed that pairwise-independent hash functions must 
be "dense" in the sense that a typical output in a typical function in the family must depend on 
a linear number of inputs. So despite their numerous nice properties, in terms of the number of 
input-output dependencies, pairwise-independent functions are quite complex. Motivated by an 
application to local cryptography, Bogdanov and Rosen [BR11] recently gave a way to bypass this 
barrier in the context of hardness amplification of "local" functions. 

In this work we study sparse extractors for all the entropy; these are extractors with a small
number of overall input-output dependencies. We consider the more general notion of sparse
extractor families. An extractor family for distributions of min-entropy $k$ over $\{0,1\}^n$ with
error $\varepsilon$ is a distribution $H$ on functions $\{0,1\}^n \times \{0,1\}^s \to \{0,1\}^m$,
where $m < s + k$, such that for every distribution $X$ over $\{0,1\}^n$ of min-entropy $k$, the
statistical distance between $(H, H(X, U_s))$ and $(H, U_m)$ is at most $\varepsilon$ (where $U_s$
and $U_m$ are uniformly random). The extractor family is strong if $s = 0$, i.e. $H$ does not take
any additional randomness beyond $X$.

Without the sparsity restriction extractors and extractor families are essentially the same object, 
as the randomness used to choose an extractor from the family can be included in the seed. Once 
we take sparsity into consideration, however, extractor families allow for more flexibility. This 
advantage is especially pronounced in the case of strong extractors: Any single strong extractor
in which some output bit depends only on $\ell$ input bits cannot extract from a source that fixes
all those $\ell$ bits. In contrast, we show strong extractor families can achieve much better sparsity.

In this work we prove three results regarding sparse extractor families. First we give a simple 
construction of sparse extractor families for all the entropy. Then we show that the sparsity of 
our construction is optimal up to constant factors for a wide range of the min-entropy parameter. 
Finally, we show that an equally simple construction of weak extractor families achieves better 
sparsity. Thus when sparsity is required, weak extractors can provably outperform strong ones. 

We also show our weak extractor family gives a somewhat improved nonuniform construction 
of local pseudorandom generators from local one-way functions, based on recent work of Vadhan 
and Zheng [VZ11]. In general our results can be useful in randomized or nonuniform settings where 
hashing is used and obtaining or preserving small input-output dependencies (i.e., parallelism) is 
of interest. 

1.1 Our results 

Let $h\colon \{0,1\}^n \to \{0,1\}^m$ be a function. We say output $j$ of $h$ depends on input $i$
if there exist assignments $x, x' \in \{0,1\}^n$ that differ only in the $i$th coordinate such that
$h(x)_j \ne h(x')_j$. We say $h$ is $s$-sparse if the number of input-output pairs such that output
$j$ depends on input $i$ is at most $s$, and $h$ is $t$-local if every output $j$ depends on at
most $t$ inputs $i$.
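
The definitions of sparsity and locality can be checked by brute force on small functions. The Python sketch below is only illustrative: the helper names `dependencies`, `sparsity`, `locality` and the toy function `example_h` are assumptions introduced here, and the enumeration takes time exponential in $n$.

```python
from itertools import product

def dependencies(h, n, m):
    """Return all pairs (j, i) such that output j of h depends on input i."""
    deps = set()
    for x in product((0, 1), repeat=n):
        hx = h(x)
        for i in range(n):
            # x_flipped differs from x only in the i-th coordinate.
            x_flipped = x[:i] + (1 - x[i],) + x[i + 1:]
            hx_flipped = h(x_flipped)
            for j in range(m):
                if hx[j] != hx_flipped[j]:
                    deps.add((j, i))
    return deps

def sparsity(h, n, m):
    """Number of input-output dependencies of h."""
    return len(dependencies(h, n, m))

def locality(h, n, m):
    """Largest number of inputs that any single output depends on."""
    deps = dependencies(h, n, m)
    return max((sum(1 for (j, _) in deps if j == out) for out in range(m)), default=0)

def example_h(x):
    # Hypothetical toy function with n = 3 inputs and m = 2 outputs.
    return (x[0] ^ x[1], x[1] & x[2])
```

For `example_h`, each output depends on two inputs, so the function is 4-sparse and 2-local.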

Theorem 1. Let $K$ be a sufficiently large constant and $n, k, m, \delta$ be parameters such that
$1 < m < k < n$, and $0 < \delta < 1$. Let $H(x) = Mx$, where $M$ is an $m \times n$ matrix over
$GF(2)$ where each entry equals 1 independently with probability

$$p = \min\left\{\frac{1}{m}\cdot\log\frac{m}{\delta}\,\ln\frac{Kn}{m},\ \frac{1}{2}\right\}.$$

Then $H$ is a strong extractor family for min-entropy $k$ with error at most
$\frac{1}{2}\sqrt{\delta + K\cdot 2^{-k+m}}$.

By a large deviation bound, all but a $\delta$-fraction of the functions in $H$ are $O(nmp)$-sparse.
The best error we can hope for [RT97] is $\Omega(\sqrt{2^{-k+m}})$, which is achieved by a pairwise
independent hash function family. Perhaps the simplest construction of such a family is to choose
each entry of $M$ independently at random with probability $p = 1/2$. When $p < 1/2$, Theorem 1
shows the sparsity can be reduced dramatically at the cost of increasing the error by a little. For
example if we set $\delta = 2^{-k+m}$, we obtain an $O(n \log m \log(n/m))$-sparse strong extractor
family whose error is within a constant factor of optimal.
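
To make the sparsity estimate concrete, the following Python sketch samples a matrix $M$ as in Theorem 1 and counts its nonzero entries, which for a linear map is exactly its number of input-output dependencies. It is a sketch under stated assumptions: the function names are introduced here, the constant $K = 15$ is borrowed from the proof's choice of $\ln(15n/m)$ rather than fixed by the theorem, and $\log$ is taken to base 2.

```python
import math
import random

def theorem1_density(n, m, delta, K=15.0):
    """p = min{(1/m) * log2(m/delta) * ln(K*n/m), 1/2}; K = 15 is an assumed constant."""
    return min(math.log2(m / delta) * math.log(K * n / m) / m, 0.5)

def sample_matrix(n, m, p, rng):
    """Sample an m x n matrix over GF(2) whose entries are 1 independently with probability p."""
    return [[1 if rng.random() < p else 0 for _ in range(n)] for _ in range(m)]

def apply_matrix(M, x):
    """Compute Mx over GF(2)."""
    return tuple(sum(a & b for a, b in zip(row, x)) % 2 for row in M)

def matrix_sparsity(M):
    """Each nonzero entry M[j][i] is one dependency of output j on input i."""
    return sum(map(sum, M))
```

With these definitions, `matrix_sparsity(sample_matrix(n, m, p, rng))` concentrates around `n * m * p`, in line with the large deviation bound mentioned above.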

Our main negative result shows that this sparsity is necessary for a large range of values of $k$
and when $\delta$ is constant.

Theorem 2. Suppose $n^{0.99} < m < n/6$. There exists a distribution $\mathcal{D}$ over
distributions $X$ on $\{0,1\}^n$ of min-entropy $1.5m$ each so that for every function
$h\colon \{0,1\}^n \to \{0,1\}^m$ of sparsity $0.001\, n \log m \log(2n/m)$, the expected statistical
distance between $h(X)$ and the uniform distribution over $\{0,1\}^m$ is at least
$1 - e^{-m^{\Omega(1)}}$.

By Yao's minimax principle, or simply by taking the convex combination of the distributions $X$
as the bad distribution, we conclude that the sparsity in Theorem 1 is optimal up to a constant
factor for this range of parameters. The next result concerns weak extractor families.

Theorem 3. Let $K$ be a sufficiently large constant and $n, k, c, m$ be parameters such that
$1 < k < n$, $1 < s < m$ and $c > 1$. Let $H\colon \{0,1\}^n \times \{0,1\}^s \to \{0,1\}^m$ be given
by $H(x, r) = Mx + Br$, where $M$ is an $m \times n$ random matrix in which each entry equals 1
independently with probability

$$p = \min\left\{\frac{K}{m}\cdot\ln\frac{n}{\ln c},\ \frac{1}{2}\right\},$$

and $B$ is an $m \times s$ ($m > s$) matrix of full rank where every set of at most $m/2K$ rows is
linearly independent. Then $H$ is an extractor family for min-entropy $k$ with error
$\frac{1}{2}\sqrt{c\cdot 2^{-k-s+m}}$.

The construction of a matrix $B$ with the desired properties and $O(m)$ sparsity is a well studied
problem in the theory of low density parity check codes [Gal62, SS96]. Capalbo et al. [CRVW02]
give an explicit construction with $s = \alpha m$ for some constant $\alpha < 1$ and every $m$, which
is optimal up to the choice of the constant $\alpha$. Instantiating Theorem 3 with this matrix, and
setting $c = 2$, we obtain a family of $O(n \log n)$-sparse extractors with error
$O(\sqrt{2^{-k-s+m}})$, which is optimal up to a constant factor. (If $m = k + s - O(1)$, the output
contains almost all the entropy from the source plus all the entropy invested by the seed and the
error is an arbitrarily small constant.) By using a larger value of $c$, we can reduce the sparsity
at the cost of increasing the error.
We observe that using the randomized encoding of Applebaum et al. [AIK04a], these extractors
can be made to have constant locality at the cost of increasing the seed length $r$ to
$O(n \log n)$ bits.

For certain parameters the weak extractor family from Theorem 3 bypasses the limitation
on strong extractor families from Theorem 2. For example, for constant statistical distance and
$m = n^{0.99}$, Theorem 2 implies that for a strong extractor to produce even a constant fraction of
the entropy a sparsity of $\Omega(n (\log n)^2)$ is necessary, while Theorem 3 says that an
$O(n \log n)$-sparse weak extractor family can extract all but $O(1)$ bits of entropy.

1.2 An application 

Pseudorandom generators from one-way functions. The construction of pseudorandom generators
from one-way functions of Håstad et al. [HILL99] does not in general preserve locality.
Haitner, Reingold, and Vadhan [HRV10] gave a construction that is more efficient and can be
implemented in $NC^1$. Recently Vadhan and Zheng [VZ11] gave an even simpler variant of this
construction. In combination with the "compiler" of Applebaum, Ishai, and Kushilevitz [AIK04b],
one obtains a generic locality-preserving transformation of one-way functions into pseudorandom
generators.

Applying the transformation of Applebaum et al. may have an adverse effect on seed length, as
it may grow quadratically. However, the construction of Vadhan and Zheng is extremely simple; it
is obtained by applying an extractor to a sequence of "blocks", each of which inherits the locality
of the one-way function $f$. Instantiating the extractor by the construction from Theorem 3, we
obtain a transformation of nonuniform one-way functions into nonuniform pseudorandom generators
that preserves output locality logarithmic in the size of the adversary with the same seed length as
the one obtained by Vadhan and Zheng. Using an additional idea of Applebaum et al., the
transformation can be made to preserve constant output locality at the expense of increasing the
seed length. We describe this application in Section 5 (see Proposition 9).

1.3 Related work 

Sparse extractors for restricted sources. Motivated by certain applications, Zhou and Bruck
[ZB11] show that low density random matrices can efficiently extract random bits from some
restricted noisy sources, such as bit fixing sources and Markov sources. Our Theorem 1 shows that
essentially the same construction extracts from arbitrary sources of given min-entropy.

Extractors in $NC^0$. Applebaum, Ishai and Kushilevitz [AIK06] give a weak extractor in $NC^0$
(thus sparsity $O(n)$) that works for min-entropy $k = (1 - O(1))n$, but suffers $\Omega(n)$ entropy
loss. Our extractor family from Theorem 3 matches these parameters. The construction from [AIK06]
does not appear to extend to distributions of smaller min-entropy or allow for smaller entropy loss,
while ours does. However, they provide a single extractor that works for all distributions, while we
only give an extractor family.

Locally computable extractors. A locally computable extractor [Lu04, Vad03] is an extractor in
which after the seed is fixed, the output as a whole depends on a small number of input bits. Such
extractors are used to implement private-key encryption in the bounded storage model [Mau92].
We observe that the notions of locally computable extractors and sparse extractor families are
fundamentally different. This is best illustrated in the regime in which we extract all the entropy,
which is of main interest in this work. A lower bound of Vadhan [Vad03] shows that when the
output length $m$ is linear in the min-entropy $k$, then even after the seed is fixed the output of
the extractor as a whole must depend on at least a linear fraction of the input. Thus $o(n)$-locally
computable extractors are not possible when $m = \Omega(k)$. Although this is inevitable, our results
show that it is possible to make a small number of input-output dependencies.

We observe that the locally computable extractors of Lu [Lu04] and De and Trevisan [DT09] are 
also sparse, but they extract only a fixed root of the min-entropy k. This is sufficient for bounded 
storage cryptography, but not for the application we describe in Section 5. 

1.4 Our proofs 

Bogdanov and Rosen [BR11] proved a quantitatively weaker version of Theorem 1 that achieves
sparsity $O(n (\log n)^3)$ instead of the optimal $O(n \log(m) \log(n/m))$. (They did not attempt to
determine the dependence on $m$ and they can achieve sparsity $O(n \log(m) \log(n/m)^2)$.) In their
proof, $H$ is viewed as a collection of boolean functions $(h_1, \ldots, h_m)$,
$h_i\colon \{0,1\}^n \to \{0,1\}$. They show that for most choices of $h_1$, conditioning on $h_1(x)$
reduces the min-entropy $k$ of $x$ by at most $1 + 1/\mathrm{poly}(k)$ bits (unless $k$ is very
small), and so this bit extraction can be applied iteratively for $m$ steps.

One drawback of this argument is that as $i$ gets larger and the min-entropy of $x$ conditioned
on $h_1(x), \ldots, h_{i-1}(x)$ becomes smaller, the density of the functions $h_i$ must keep
increasing (as required by our lower bound). To achieve our optimal (up to constant) sparsity, we
must analyze the effect of all the functions $h_1, \ldots, h_m$ simultaneously.

To do this, we upper bound the probability that two samples $x, x'$ collide under $h$, that is the
probability that $h(x + x') = 0$. For a fixed pair $(x, x')$, each entry $h_i(x + x')$ of $h(x + x')$
is biased towards zero. We can think of $h(x + x')$ as a random variable that takes value zero with
some probability $p(x, x')$, and is unbiased otherwise. Intuitively, our analysis shows that the
unbiased components of this distribution dominate in collisions. Several technical complications
arise in the formal argument. One useful tool that allows us to analyze the case when most of the
components of $h(x + x')$ are unbiased is Hölder's inequality.

To give an idea of our proof of Theorem 2, let's make the simplifying assumption that $h$ is
$t$-local, where $t = \gamma (n/m) \log m \log(n/m)$. We give a heuristic argument why we expect the
output of $h$ to be far from uniform when $h$ is linear. Let $X$ be the $p$-biased distribution over
$\{0,1\}^n$ (each bit takes value 1 independently with probability $p$), where $p$ is chosen so that
$H(p) = m/n$ and $H(p)$ is the binary entropy of $p$. Then the distribution $X$ has Shannon entropy
$m$. However, every output bit of $h(X)$ is $(1 - 2p)^t$-biased, and we chose the parameters so that
$(1 - 2p)^t = m^{-O(\gamma)}$. By choosing $\gamma$ small enough, we can ensure that every output bit
of $h$ has, say, $m^{-1/2}$ bits of entropy deficiency, so by the sub-additivity of Shannon entropy
$h(X)$ has $m^{1/2}$ fewer bits of entropy than a uniformly random variable over $\{0,1\}^m$. So
$h(X)$ does not "look" random in terms of Shannon entropy.

To turn this heuristic argument into a proof we need to handle several issues, the most interesting 
of which is replacing entropy deficiency by statistical distance from the uniform distribution. One 
advantage of measuring entropy deficiency is that entropy is subadditive, which allows us to ignore 
the dependencies between the various outputs of h(X) in the above argument. In contrast, to 
obtain a good lower bound on statistical distance we must take into account these dependencies. 
Here we apply a tail bound for read-$t$ families [GLSS12]. To extend the analysis from linear
functions to general ones we apply an elegant idea of Viola [Vio05] of shifting $X$ by a random offset.

We establish Theorem 3 by a relatively straightforward probabilistic calculation.



1.5 Open problems 

In terms of seed length our sparse extractors are quite poor. For example the size of the family $H$
in Theorem 1 is exponential in $n$. By a standard probabilistic argument it can be shown that a
random sample of $H$ of size $O(n/\varepsilon^2)$ is as effective as the whole family while incurring
an additional penalty of only $\varepsilon$ in statistical distance. Consequently there is no
existential obstacle to sparse extractor families with short seed. It remains to see if such
families can be found efficiently.

For weak sparse extractors, our work leaves open two possible improvements. First, we do not 
know if the sparsity of our weak extractors is the best possible. Second, we do not know what is 
the minimal size of a sparse weak extractor family. It could be that even a family of size 1, i.e. 
a single sparse weak extractor, is sufficient. Such an extractor could be used to obtain a uniform 
construction of local pseudorandom generators from local one-way functions. Could there be a 
single sparse weak extractor of sparsity linear in $n$ that extracts $k - k^{0.99}$ bits of
min-entropy for every source over $\{0,1\}^n$ of min-entropy $k$?



2 Proof of Theorem 1 

To prove Theorem 1 it is sufficient to show that for every set $S$ of size $2^k$, the statistical
distance between $(H, H(X))$ and $(H, U)$ is at most $\frac{1}{2}\sqrt{\delta + O(2^{-k+m})}$, where
$X$ is chosen at random from $S$ and $U$ is uniformly random. In fact we will show for every
$x_0 \in S$,

$$\Pr_{H,X}[H(X) = H(x_0)] \le \frac{1 + \delta + O(2^{-k+m})}{2^m},$$

from where

$$\Pr_{H,X,X'}[H(X) = H(X')] \le \max_{x_0 \in S} \Pr_{H,X}[H(X) = H(x_0)] \le \frac{1 + \delta + O(2^{-k+m})}{2^m},$$

where $X$ and $X'$ are independent samples from $S$. This is sufficient to establish Theorem 1 using
the relation between collision probability and statistical distance from Claim 10 in Appendix A.

Proof. When $p = 1/2$ the analysis is standard, so we will assume that
$p = \frac{1}{m}\log(m/\delta)\ln(15n/m) < 1/2$. Since entries of $M$ are chosen independently from
each other, for any $y \in \{0,1\}^n$, we have

$$\Pr_H[H(y) = 0] = \Pr_a[\langle a, y\rangle = 0]^m = \left(\frac{1 + (1-2p)^{|y|}}{2}\right)^m = \frac{1}{2^m}\sum_{i=0}^{m}\binom{m}{i}(1-2p)^{i|y|}.$$

Here $a \sim \{0,1\}^n$ is chosen from the $p$-biased distribution. Let $S_0$ be the set
$\{x_0 + x : x \in S\}$. Then

$$\Pr_{H,X}[H(X) = H(x_0)] = \Pr_{H,\,y\sim S_0}[H(y) = 0] = E_{y\sim S_0}\left[\frac{1}{2^m}\sum_{i=0}^{m}\binom{m}{i}(1-2p)^{i|y|}\right] = \frac{1}{2^m}\sum_{i=0}^{m}\binom{m}{i}E_{y\sim S_0}\left[(1-2p)^{i|y|}\right].$$

Let $a_i = E_{y\sim S_0}[(1-2p)^{i|y|}]$. We now upper bound the sum $\sum_{i=0}^{m}\binom{m}{i}a_i$
by $1 + \delta + O(2^{-k+m})$. We will consider two cases. When $i$ is small, specifically
$i < k/(2\log(m/\delta))$, we show that $a_i$ decreases at a rate faster than $(m/\delta)^{-i}$, so
the sum is dominated by the term $i = 0$. When $i$ is large, we want to bound both $a_i$ and
$\binom{m}{i}$ by their largest possible values. To achieve this, we have to further split the large
$i$'s into "mildly large" and "very large" ones and apply the argument to each summation separately.
The resulting contribution is $O(2^{-k+m})$. Notice that, in the case $m < k/(2\log(m/\delta))$, we
need not consider the contribution of large $i$'s; thus we can improve $p$ to be
$\frac{1}{k}\cdot\log(m/\delta)\ln(15n/k)$ by using the same analysis for small $i$'s.

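The small $i$'s. We show that if $i < k/(2\log(m/\delta))$, then $a_i \le (m/\delta)^{-1.8i}$ and
therefore

$$\sum_{i=0}^{k/(2\log(m/\delta))}\binom{m}{i}a_i \le \left(1 + (m/\delta)^{-1.8}\right)^m \le e^{\delta^{1.8}/m^{0.8}} \le 1 + \delta.$$

To bound $a_i$ we apply Hölder's inequality, which says that for every $B \ge 1$:

$$a_i = E_{y\sim S_0}\left[(1-2p)^{i|y|}\right] = \frac{1}{|S|}\sum_{y\in\{0,1\}^n} 1_{S_0}(y)\cdot(1-2p)^{i|y|} \le \frac{1}{|S|}\,|S|^{1-1/B}\left(\sum_{y\in\{0,1\}^n}(1-2p)^{Bi|y|}\right)^{1/B}.$$

The last expression can be simplified to give

$$a_i \le \left(\frac{(1 + (1-2p)^{Bi})^n}{|S|}\right)^{1/B} \le \left(\frac{(1 + e^{-2piB})^n}{|S|}\right)^{1/B}.$$

We choose $B = k/(2i\log(m/\delta))$, which is at least one because $i < k/(2\log(m/\delta))$. By our
choice of $p$, it follows that $2piB \ge \ln(15n/m)$ and so

$$a_i \le \left(\frac{(1 + m/15n)^n}{2^k}\right)^{1/B} \le \left(e^{m/15}\cdot 2^{-k}\right)^{1/B} \le (m/\delta)^{-1.8i}.$$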
The large $i$'s. We have that

$$a_i = E_{y\sim S_0}\left[(1-2p)^{i|y|}\right] \le \frac{1}{|S|}\sum_{y\in\{0,1\}^n}(1-2p)^{i|y|} = \frac{(1 + (1-2p)^i)^n}{|S|} \le \frac{(1 + e^{-2pi})^n}{|S|}. \qquad (1)$$

When $i \ge m/4$, the last expression is at most $(1 + e^{-pm/2})^n/|S|$. By our choice of $p$,
$pm/2 \ge \frac{1}{2}\log m\ln(15n/m)$. Optimizing over $\log m$, it can be calculated that this
expression is at least $\ln n$ when $n$ is sufficiently large. It then follows that

$$a_i \le \frac{(1 + 1/n)^n}{|S|} \le e\,2^{-k} \quad\text{and so}\quad \sum_{i=m/4}^{m}\binom{m}{i}a_i \le 2^m\cdot e\,2^{-k} = e\,2^{-k+m}.$$

Finally, we handle the $i$'s in the range $k/(2\log(m/\delta)) \le i < m/4$. Using (1) and the lower
bound on $i$, we have that

$$a_i \le \frac{(1 + m/15n)^n}{|S|} \le \frac{e^{m/15}}{|S|} \le 2^{0.1m - k}$$

and so

$$\sum_{k/(2\log(m/\delta)) \le i < m/4}\binom{m}{i}a_i \le 2^{0.1m-k}\sum_{i=0}^{m/4}\binom{m}{i} \le 2^{0.1m-k}\cdot 2^{H(1/4)m + O(1)} \le 2^{-k+m+O(1)},$$

where $H$ is the binary entropy function, and $H(1/4) < 0.9$. □
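
As a sanity check on the identity for $\Pr_H[H(y) = 0]$ used at the start of the proof, the following Python sketch compares three quantities for a vector $y$ of Hamming weight $w$: the exact probability that a $p$-biased row is orthogonal to $y$ (raised to the power $m$), the closed form $((1 + (1-2p)^w)/2)^m$, and its binomial expansion. The function names are assumptions introduced for this illustration.

```python
from math import comb

def row_zero_prob(w, p):
    """Exact Pr[<a, y> = 0 over GF(2)] when y has weight w and a is p-biased.

    The inner product is zero iff a has an even number of ones on the support of y.
    """
    return sum(comb(w, j) * p**j * (1 - p)**(w - j) for j in range(0, w + 1, 2))

def closed_form(w, p, m):
    """((1 + (1 - 2p)^w) / 2)^m, the probability that all m independent rows vanish on y."""
    return ((1 + (1 - 2 * p)**w) / 2)**m

def binomial_expansion(w, p, m):
    """2^{-m} * sum_i C(m, i) (1 - 2p)^{i w}, the form summed over y in the proof."""
    return sum(comb(m, i) * (1 - 2 * p)**(i * w) for i in range(m + 1)) / 2**m
```

All three expressions agree up to floating point error, which is the first chain of equalities in the proof.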
3 Proof of Theorem 2



Let distribution $\tilde{X}$ be a truncated variant of the $p$-biased distribution $X$ where $p$ is
chosen so that $H(p) = 2m/n$ and $p < 1/2$. The distribution $\mathcal{D}$ on distributions is
defined as follows. Choose $y$ uniformly from $\{0,1\}^n$ and output $y + \tilde{X}$. To prove
Theorem 2, we will show that for most choices of $y$, there exists a statistical test $T_y$ that
distinguishes $h(y + \tilde{X})$ from the uniform distribution $U$, and then argue that the expected
statistical distance over the choice of $y$ between $h(y + \tilde{X})$ and $U$ is large.
We will define $T_y$ shortly for $y \in \{0,1\}^n$ and first argue that for most choices of $y$,
$T_y$ distinguishes $h(y + X)$ from $U$. Then we will show how to define $\tilde{X}$ of min-entropy
$1.5m$ in a way that $X$ and $\tilde{X}$ are statistically close. Finally, we conclude that the
expected statistical distance over the choice of $y$ between $h(y + \tilde{X})$ and $U$ is at least
$1 - e^{-m^{\Omega(1)}}$.

The following bounds on $p$ are obtained by plugging in $H(p) = 2m/n$ in Lemma 11 in Appendix B:

$$\frac{m}{3n\log_2(n/m)} \le p \le \frac{2m}{n\log_2(n/2m)}. \qquad (2)$$

Now suppose $h$ has sparsity $(m/2p)\beta\log m$, where $\beta$ is a sufficiently small constant,
say $\beta = 0.08$. Notice this $\beta$ also satisfies $n \le m^{1+2\beta}$ (since by assumption
$m > n^{0.99}$). Partition the inputs of $h$ into two sets $H$ and $L$, where $H$ contains those
inputs that participate in at least $m^{2-6\beta}/pn$ outputs of $h$, and $L$ contains the rest. By
Markov's inequality (using the assumption $n \le m^{1+2\beta}$), $H$ has size at most
$m^{8\beta}\beta\log m$. For $x \in \{0,1\}^n$, let $x_0$ and $x_1$ denote its projections onto $H$
and $L$, respectively. For every $y \in \{0,1\}^n$, we define the statistical test

$$T_y = \left\{z \in \{0,1\}^m : \Delta(h(x_0, y_1), z) \le 1/2 - m^{-\beta}/4 \text{ for some } x_0\right\}$$

where $\Delta(a, b)$ is the relative Hamming distance between the strings $a$ and $b$, i.e. the
fraction of positions in which they differ.

Claim 4. For sufficiently large $m$, $\Pr_X[h(X + y) \in T_y] \ge 1 - e^{-m^{3\beta}/2}$ for at
least a $1 - e^{-m^{3\beta}/2}$ fraction of the values of $y \in \{0,1\}^n$.

In the proof we will need the following fact about Boolean functions $f\colon \{0,1\}^d \to \{0,1\}$:

$$\Pr_{X,Y}[f(X + Y) \ne f(Y)] \le \frac{1}{2}\left(1 - (1-2p)^d\right) \qquad (3)$$

where $Y$ is uniformly distributed in $\{0,1\}^d$, and $X$ is chosen independently from the
$p$-biased distribution on $\{0,1\}^d$. This fact follows easily by Fourier analysis [O'D02] and was
also used by Viola [Vio05] in a context related to ours.
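
Inequality (3) can be verified exhaustively for small $d$. The Python sketch below computes the left-hand side exactly for a Boolean function given as a truth table and compares it with the right-hand side; the names `noise_disagreement`, `bound` and `random_function` are assumptions introduced for this illustration, and the check is a numerical sanity test rather than a proof.

```python
import random
from itertools import product

def noise_disagreement(f, d, p):
    """Exact Pr[f(X + Y) != f(Y)] for uniform Y and p-biased X on {0,1}^d.

    f is a dict mapping each d-bit tuple to 0 or 1.
    """
    points = list(product((0, 1), repeat=d))
    total = 0.0
    for x in points:
        weight = sum(x)
        prob_x = p**weight * (1 - p)**(d - weight)
        disagreements = sum(
            f[tuple(a ^ b for a, b in zip(x, y))] != f[y] for y in points
        )
        total += prob_x * disagreements / len(points)
    return total

def bound(d, p):
    """Right-hand side of (3): (1/2) * (1 - (1 - 2p)^d)."""
    return 0.5 * (1 - (1 - 2 * p)**d)

def random_function(d, rng):
    """A uniformly random truth table on d bits (for testing only)."""
    return {pt: rng.randrange(2) for pt in product((0, 1), repeat=d)}
```

The parity function on all $d$ bits meets the bound with equality, since $f(X + Y) \ne f(Y)$ exactly when $X$ has odd weight.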

We will also make use of the following inequality of Gavinsky et al. [GLSS12], which we apply to
show that for most choices of $(x, y)$ the outcome concentrates around its expectation. A collection
of indicator random variables $Z_1, \ldots, Z_m$ is called a read-$t$ family if they can be written
as functions of independent random variables $X_1, \ldots, X_n$ where each $X_i$ affects at most $t$
of the $Z_j$'s. Then for every $\varepsilon > 0$,

$$\Pr[Z \ge E[Z] + \varepsilon m] \le e^{-2\varepsilon^2 m/t}, \qquad (4)$$

where $Z = Z_1 + \cdots + Z_m$.
Proof. We will show that for every choice of $x_0, y_0$, with probability
$1 - e^{-\Omega(m^{3\beta})}$ over the choice of $x_1, y_1$, $h(x + y) = h(x_0 + y_0, x_1 + y_1)$ is
in $T_y$. Fix $x_0, y_0$ and consider the function $h_{x_0+y_0}(x_1) = h(x_0 + y_0, x_1)$. Let
$Z = Z_1 + \cdots + Z_m$, where

$$Z_i = \begin{cases} 1, & \text{if } h^i_{x_0+y_0}(x_1 + y_1) \ne h^i_{x_0+y_0}(y_1), \\ 0, & \text{otherwise.} \end{cases}$$

Suppose $h^i_{x_0+y_0}$ depends on $d_i$ inputs for $1 \le i \le m$. By (3) we have

$$E[Z_i] \le \frac{1}{2}\left(1 - (1-2p)^{d_i}\right).$$

By linearity of expectation and $\sum_i d_i \le (m/2p)\beta\log m$, we get

$$E[Z] \le \frac{m}{2}\left(1 - (1-2p)^{\beta\log m/2p}\right) \le m\left(1/2 - m^{-\beta}/2\right).$$

Now we apply the tail bound (4) to $Z_1, \ldots, Z_m$ with $t = m^{2-6\beta}/pn$ and
$\varepsilon = m^{-\beta}/4$ to obtain

$$\Pr[Z \ge m(1/2 - m^{-\beta}/4)] \le e^{-2\varepsilon^2 m/(m^{2-6\beta}/pn)} \le e^{-m^{3\beta}},$$

where we used the estimate (2) to lower bound $pn$. In other words,

$$\Pr_{x_1,y_1}\left[\Delta(h_{x_0+y_0}(y_1), h_{x_0+y_0}(x_1 + y_1)) < 1/2 - m^{-\beta}/4\right] \ge 1 - e^{-m^{3\beta}}.$$

It follows that

$$\Pr_{x,y}[h(x + y) \in T_y] \ge E_{x_0,y_0}\left[\Pr_{x_1,y_1}\left[\Delta(h_{x_0+y_0}(y_1), h_{x_0+y_0}(x_1 + y_1)) < 1/2 - m^{-\beta}/4\right]\right] \ge 1 - e^{-m^{3\beta}}.$$

Applying Markov's inequality, we conclude that for at least a $1 - e^{-m^{3\beta}/2}$ fraction of
the choices of $y$,

$$\Pr_X[h(X + y) \in T_y] \ge 1 - e^{-m^{3\beta}/2}. \qquad □$$

Claim 5. For any fixed $y \in \{0,1\}^n$, with probability $1 - 2^{-\Omega(m^{1-2\beta})}$ over the
choice of a uniform $U \sim \{0,1\}^m$, $U$ is not in $T_y$.

Proof. Since $H$ has size at most $m^{8\beta}\beta\log m$, the range of $h(x_0, y_1)$ has at most
$2^{m^{8\beta}\beta\log m}$ elements. For every such element $h(x_0, y_1)$, the probability that $U$
is within distance $m/2 - m^{1-\beta}/4$ of $h(x_0, y_1)$ can be computed by Chernoff bounds to be at
most $2^{-\Omega(m^{1-2\beta})}$. Taking a union bound over all such $h(x_0, y_1)$, we obtain

$$\Pr[U \in T_y] \le 2^{m^{8\beta}\beta\log m}\cdot 2^{-\Omega(m^{1-2\beta})} = 2^{-\Omega(m^{1-2\beta})}$$

as long as $\beta < 1/10$ and $m$ is sufficiently large. □

From these two claims, it follows that for a $1 - e^{-m^{3\beta}/2}$ fraction of the choices of $y$,

$$\Pr_X[h(X + y) \in T_y] - \Pr_U[U \in T_y] \ge 1 - e^{-m^{3\beta}/2} - 2^{-\Omega(m^{1-2\beta})}. \qquad (5)$$

To finish the proof, we show how to replace $X$ with another variable $\tilde{X}$ of min-entropy at
least $1.5m$ that is statistically close to it. We define $\tilde{X}$ as follows: First, choose $X$
from the $p$-biased distribution. If the Hamming weight of $X$ is at least $0.9pn$, set
$\tilde{X} = X$. Otherwise, let $\tilde{X}$ be uniformly random in $\{0,1\}^n$. We prove the
following claim in Appendix C:






Claim 6. $\tilde{X}$ has min-entropy at least $1.5m$.

Clearly the same conclusion holds for the distribution $\tilde{X} + y$. The statistical distance
between $X$ and $\tilde{X}$ is upper bounded by the probability that $X$ has Hamming weight less
than $0.9pn$. By Chernoff bounds, the probability of this is at most $\exp(-\Omega(pn))$, which
using the lower bound (2) is at most $\exp(-\Omega(m/\log(n/m))) = \exp(-m^{\Omega(1)})$ (since
$m > n^{0.99}$). Applying the triangle inequality, for all $y$ satisfying (5) we have

$$\Pr_{\tilde{X}}[h(\tilde{X} + y) \in T_y] - \Pr_U[U \in T_y] \ge 1 - e^{-m^{\Omega(1)}}.$$

We conclude that the expected statistical distance between $h(\tilde{X} + y)$ and $U_m$ for a random
choice of $y$ is at least $(1 - e^{-m^{\Omega(1)}})(1 - e^{-m^{\Omega(1)}}) = 1 - e^{-m^{\Omega(1)}}$.

4 Proof of Theorem 3 

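As in the proof of Theorem 1, it is sufficient to show for every set $S$ of size $2^k$ and every
$x_0$ in $S$ and $r_0$ in $\{0,1\}^s$,

$$\Pr_{M,X,R}[MX + BR = Mx_0 + Br_0] \le \frac{1 + c\cdot 2^{-k-s+m}}{2^m},$$

where the probability is taken over the random matrix $M$, $X$ chosen uniformly from $S$ and $R$
chosen uniformly from $\{0,1\}^s$.

Assume that $p < 1/2$. Let $S_0$ be the set $\{x + x_0 : x \in S\}$. Then

$$\Pr_{M,X,R}[MX + BR = Mx_0 + Br_0] = \Pr_{M,X,R}[M(X + x_0) = B(R + r_0)] = \Pr_{M,Y,R}[MY = BR]$$

where $Y$ is a random element from $S_0$. Let $M_i, B_i$ denote the $i$th row of $M$ and $B$. Then

$$\Pr[MY = BR] = E_{M,Y,R}\left[\prod_{i=1}^{m}\frac{1 + (-1)^{M_iY + B_iR}}{2}\right] = \frac{1}{2^m}\sum_{T\subseteq[m]}E_{M,Y,R}\left[(-1)^{\sum_{i\in T}(M_iY + B_iR)}\right] = \frac{1}{2^m}\sum_{T\subseteq[m]}E_{M,Y}\left[(-1)^{\sum_{i\in T}M_iY}\right]E_R\left[(-1)^{\sum_{i\in T}B_iR}\right].$$

Since any $t = m/2K$ rows of $B$ are linearly independent, for every nonempty $T$ of size at most
$t$, $\sum_{i\in T}B_i \ne 0$ and so $E[(-1)^{\sum_{i\in T}B_iR}] = 0$. On the other hand for every
$T$ of size at least $t$ we have

$$E_{M,Y}\left[(-1)^{\sum_{i\in T}M_iY}\right] = \frac{1}{2^k}\sum_{y\in S_0}E_M\left[(-1)^{\sum_{i\in T}M_iy}\right] = \frac{1}{2^k}\sum_{y\in S_0}(1-2p)^{|y||T|} \le \frac{1}{2^k}\sum_{y\in\{0,1\}^n}(1-2p)^{|y|t} = \frac{1}{2^k}\left(1 + (1-2p)^t\right)^n \le \frac{e^{ne^{-2pt}}}{2^k}.$$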
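Since $B$ has full rank, the condition $\sum_{i\in T}B_i = 0$ is satisfied for at most $2^{m-s}$
sets $T$. Hence,

$$\sum_{T\subseteq[m]}E_{M,Y}\left[(-1)^{\sum_{i\in T}M_iY}\right]E_R\left[(-1)^{\sum_{i\in T}B_iR}\right] = 1 + \sum_{T:|T|\ge t}E_{M,Y}\left[(-1)^{\sum_{i\in T}M_iY}\right]E_R\left[(-1)^{\sum_{i\in T}B_iR}\right] \le 1 + \sum_{T:\,|T|\ge t,\ \sum_{i\in T}B_i = 0}\frac{e^{ne^{-2pt}}}{2^k} \le 1 + 2^{m-s}\cdot\frac{e^{ne^{-2pt}}}{2^k}.$$

Plugging in $t = m/2K$ and $p = \frac{K}{m}\cdot\ln\frac{n}{\ln c}$ we get the desired bound.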

5 Local pseudorandom generators from local one-way functions 

A sequence of correlated random variables $X_1, \ldots, X_m$ taking values in $\{0,1\}^n$ has
$(s, \varepsilon)$ conditional pseudo-min-entropy $r$ if for every $1 \le i \le m$, there exists a
random variable $Y_i$ jointly distributed with $X_1, \ldots, X_{i-1}$ such that the min-entropy of
$Y_i$ conditioned on any choice of $X_1, \ldots, X_{i-1}$ is at least $r$ and for every circuit $D$
of size $s$,

$$\left|\Pr[D(X_i) \mid X_1, \ldots, X_{i-1}] - \Pr[D(Y_i) \mid X_1, \ldots, X_{i-1}]\right| \le \varepsilon.$$

Vadhan and Zheng [VZ11] give the following construction of conditional pseudo-min-entropy
sequences from a one-way function $f\colon \{0,1\}^n \to \{0,1\}^n$. Let $Z_i$, $1 \le i \le t$, be
the random strings

$$Z_i = {}_{o_i}\!\left[f(x_{i1}) \circ x_{i1} \circ \cdots \circ f(x_{ik}) \circ x_{ik}\right]_{n - o_i}$$

where $1 \le o_i \le n$ is an offset, $x_{i1}, \ldots, x_{ik}$ are random strings, and
${}_{a}[y]_{b}$ denotes truncating the first $a$ and last $b$ bits of $y$ respectively. Let
$X_j = Z_{1j}Z_{2j}\cdots Z_{tj}$, where $Z_{ij}$ denotes the $j$th bit of $Z_i$.
Vadhan and Zheng prove the following theorem (we state it in the nonuniform setting).

Theorem 7 (Vadhan and Zheng). Suppose $f\colon \{0,1\}^n \to \{0,1\}^n$ is computable by a circuit
of size $\mathrm{poly}(n)$ and is hard to invert on a $1/s$ fraction of inputs by circuits of size
$s$. There exist offsets $o_1, \ldots, o_t$ such that for every $\varepsilon$, $X_1, \ldots, X_m$ has
$(s^{\Omega(1)}/\mathrm{poly}(n\varepsilon), \varepsilon)$ conditional pseudo-min-entropy at least
$t(1/2 + \Omega((\log s)/n))$ where $k = O(n/\log s)$,
$t = O((n/\log s)^2\log^2 n\log(1/\varepsilon))$ and $m = 2(k-1)n$.

The following claim was proved in the uniform setting by Haitner, Reingold, and Vadhan. We 
need a nonuniform version of it, whose proof is analogous. We include it at the end of this section 
for completeness. 

Claim 8. Suppose $X_1, \ldots, X_m$ (where $X_i$ takes values in $\{0,1\}^t$) has
$(T, \varepsilon_1)$ conditional pseudo-min-entropy $a$. Let $H$ be an extractor family for
min-entropy $a$ with error $\varepsilon_2$ so that every function in $H$ is computable in size
$T_0$. Then with probability at least $1/2$ over the choice of $H$ the distribution
$(H(X_1, R_1), \ldots, H(X_m, R_m))$ is $(T - mT_0, m\varepsilon_1 + 2m^2\varepsilon_2)$-pseudorandom
where $R_1, \ldots, R_m \sim \{0,1\}^r$.

We instantiate Claim 8 with the function family from Theorem 3, where we set the output length
of each function to be $t(1/2 + \Omega((\log s)/n)) + r$ and let $d$ be the entropy loss. We obtain
the following consequence for the function

$$G(x_{11}, \ldots, x_{tk}, r_1, \ldots, r_m) = (H(X_1, r_1), \ldots, H(X_m, r_m)).$$

Proposition 9. Suppose $f\colon \{0,1\}^n \to \{0,1\}^n$ is an $\ell$-local function computable by a
circuit of size $\mathrm{poly}(n)$ and is hard to invert on a $1/s$ fraction of inputs by circuits
of size $s$. With probability at least $1/2$ over the choice of $H$,
$G\colon \{0,1\}^{nkt + mr} \to \{0,1\}^{nkt(1 + \Omega((\log s)/n)) + mr}$ is an
$O(\ell\cdot(nkt\ln(t/\ln c) + mr))$-sparse,
$(s^{\Omega(1)}/\mathrm{poly}(n\varepsilon), \mathrm{poly}(n)(\varepsilon + \sqrt{c}\,2^{-d/2}))$
pseudorandom generator where $k = O(n/\log s)$, $t = O((n/\log s)^2\log^2 n\log(1/\varepsilon))$ and
$m = 2(k-1)n$.

We can improve the locality of $G$ at the expense of increasing its input and output length by
a factor of $O(\ln(t/\ln c))$ via the following transformation of Applebaum, Ishai, and Kushilevitz.
For every output of $G$, which is obtained by applying a sparse linear transformation to some $X_j$
and therefore has the form

$$X_{jk_1} + \cdots + X_{jk_t},$$

introduce auxiliary new inputs $r_{j3}, r_{j4}, \ldots, r_{j(t-1)}$ for $G$ and replace its
corresponding output by the tuple

$$(X_{jk_1} + X_{jk_2} + r_{j3},\ r_{j3} + X_{jk_3} + r_{j4},\ \ldots,\ r_{j(t-1)} + X_{jk_{t-1}} + X_{jk_t}).$$

Call this new function $G'$. Applebaum et al. show that if $G$ is
$(s^{\Omega(1)}, s^{-\Omega(1)})$-pseudorandom, so is $G'$. Since every bit of $X_j$ comes either
from some input $x_i$ or from some output $f(x_i)$, it follows that if $f$ has locality $\ell$, then
$G'$ has locality $3\ell$.
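
The splitting of one long sum into short sums can be sketched in Python as follows. This is an illustration of the tuple displayed above for a single output, not the full transformation of Applebaum et al.; the function names `encode_xor` and `check_all` and the 0-based indexing are assumptions introduced here. The two properties checked are that every encoded bit reads at most three bits, and that the encoded bits sum (mod 2) to the original output because each auxiliary bit appears exactly twice.

```python
from functools import reduce
from itertools import product
from operator import xor

def encode_xor(x, r):
    """Encode x[0] + ... + x[t-1] (mod 2) as t - 2 bits, each reading at most 3 bits.

    r holds the t - 3 auxiliary bits (r_3, ..., r_{t-1} in the text's 1-based notation).
    """
    t = len(x)
    if t <= 3:
        return [reduce(xor, x, 0)]  # already has locality at most 3
    assert len(r) == t - 3
    out = [x[0] ^ x[1] ^ r[0]]                    # x_1 + x_2 + r_3
    for i in range(2, t - 2):                     # r_i + x_i + r_{i+1} (middle terms)
        out.append(r[i - 2] ^ x[i] ^ r[i - 1])
    out.append(r[-1] ^ x[t - 2] ^ x[t - 1])       # r_{t-1} + x_{t-1} + x_t
    return out

def check_all(t):
    """Exhaustively check that the encoding preserves the parity of x for every x and r."""
    for x in product((0, 1), repeat=t):
        for r in product((0, 1), repeat=max(t - 3, 0)):
            if reduce(xor, encode_xor(list(x), list(r)), 0) != reduce(xor, x, 0):
                return False
    return True
```

For example, with $t = 4$ the single output $x_1 + x_2 + x_3 + x_4$ becomes the pair $(x_1 + x_2 + r_3,\ r_3 + x_3 + x_4)$.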

Proof of Claim 8. Let $Y_i$ be the conditional min-entropy model for $X_i$. We consider the hybrid
distributions

$$X^{(i)} = (H(X_1, R_1), \ldots, H(X_{i-1}, R_{i-1}), H(X_i, R_i), U_{i+1}, \ldots, U_m) \quad\text{and}$$
$$Y^{(i)} = (H(X_1, R_1), \ldots, H(X_{i-1}, R_{i-1}), H(Y_i, R_i), U_{i+1}, \ldots, U_m)$$

where $U_1, \ldots, U_m$ are uniformly random and independent. By the definition of conditional
pseudo-min-entropy, for every $i$ the distributions $X^{(i)}$ and $Y^{(i)}$ are
$(T - mT_0, \varepsilon_1)$-indistinguishable. Because $H$ is an extractor family, the distributions
$(H, H(Y_i, R_i) \mid X_1, \ldots, X_{i-1})$ and $(H, U_i)$ are within statistical distance
$\varepsilon_2$ for any choice of $X_1, \ldots, X_{i-1}$. It follows that $(H, Y^{(i)})$ and
$(H, X^{(i-1)})$ are within statistical distance at most $\varepsilon_2$, so by Markov's inequality
$Y^{(i)}$ and $X^{(i-1)}$ are within statistical distance $2m\varepsilon_2$ with probability at
least $1 - 1/2m$ over the choice of $H$. By a union bound, with probability at least $1/2$ over the
choice of $H$, $Y^{(i)}$ and $X^{(i-1)}$ are $2m\varepsilon_2$-statistically close for all $i$. For
such a choice of $H$, by the triangle inequality $X^{(m)}$ is
$(T - mT_0, m\varepsilon_1 + 2m^2\varepsilon_2)$-indistinguishable from $X^{(0)}$. Since
$X^{(m)} = (H(X_1, R_1), \ldots, H(X_m, R_m))$ and $X^{(0)}$ is the uniform distribution, we obtain
the desired conclusion. □
A Statistical distance versus collision probability

Here we give a standard bound of the statistical distance of the distributions $(H, H(X))$ and
$(H, U)$ in terms of the collision probability of $H$. The proof is very well known when the
distribution over $H$ is uniform. In our application, however, it is not, so we include the proof
for completeness.

Claim 10. Let $H$ be a distribution on functions from $\{0,1\}^n$ to $\{0,1\}^m$, $X$ be a
distribution over $\{0,1\}^n$ and $U$ be the uniform distribution over $\{0,1\}^m$. The statistical
distance between $(H, H(X))$ and $(H, U)$ is at most

$$\frac{1}{2}\sqrt{2^m\Pr_{H,X,X'}[H(X) = H(X')] - 1},$$

where $X$ and $X'$ are independent samples from the same distribution.

Proof. We upper bound the $\ell_1$ distance, which is twice the statistical distance:

$$\ell_1((H, H(X)), (H, U)) = \sum_{h,b}\left|\Pr_{H,X}[H = h \wedge H(X) = b] - \Pr_H[H = h]\,2^{-m}\right|$$
$$= \sum_{h,b}\sqrt{\Pr_H[H = h]}\cdot\sqrt{\Pr_H[H = h]}\,\left|\Pr_X[h(X) = b] - 2^{-m}\right|$$
$$\le \sqrt{\sum_{h,b}\Pr_H[H = h]}\cdot\sqrt{\sum_{h,b}\Pr_H[H = h]\left(\Pr_X[h(X) = b] - 2^{-m}\right)^2}$$
$$= \sqrt{2^m}\cdot\sqrt{\Pr_{H,X,X'}[H(X) = H(X')] - 2^{-m}}$$
$$= \sqrt{2^m\Pr_{H,X,X'}[H(X) = H(X')] - 1}. \qquad □$$
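
The bound of Claim 10 can be checked numerically on a tiny instance. The Python sketch below enumerates every $m \times n$ matrix over $GF(2)$ with its probability under independent $p$-biased entries (a non-uniform distribution over functions, as in our application), takes $X$ uniform on a set $S$, and returns both the exact statistical distance and the collision-probability bound. The function names are assumptions introduced for this illustration, and the enumeration is only feasible for very small $m$ and $n$.

```python
from itertools import product
from math import sqrt

def apply_matrix(M, x):
    """Compute Mx over GF(2)."""
    return tuple(sum(a & b for a, b in zip(row, x)) % 2 for row in M)

def claim10_quantities(n, m, p, S):
    """Return (statistical distance of (H, H(X)) from (H, U), bound from Claim 10)."""
    sd = 0.0   # sum over h of Pr[H = h] * SD(h(X), U)
    cp = 0.0   # Pr over H, X, X' of H(X) = H(X')
    for entries in product((0, 1), repeat=m * n):
        M = [entries[r * n:(r + 1) * n] for r in range(m)]
        ones = sum(entries)
        prob_h = p**ones * (1 - p)**(m * n - ones)
        counts = {}
        for x in S:
            b = apply_matrix(M, x)
            counts[b] = counts.get(b, 0) + 1
        l1 = sum(abs(counts.get(b, 0) / len(S) - 2**-m)
                 for b in product((0, 1), repeat=m))
        sd += prob_h * l1 / 2
        cp += prob_h * sum(c * c for c in counts.values()) / len(S)**2
    # max(..., 0.0) only guards against tiny negative values from rounding.
    return sd, 0.5 * sqrt(max(2**m * cp - 1, 0.0))
```

In every instance the first returned value should be at most the second, which is exactly the statement of the claim.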



B Bounds on the inverse of entropy 

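Lemma 11. For every $p \in (0, 1/2]$,

$$\frac{H(p)}{6\log_2(2/H(p))} \le p \le \frac{H(p)}{\log_2(1/H(p))}.$$

The upper bound on $p$ follows from the inequality $H(p) \ge p\log_2(1/p)$. Applying it twice we
obtain

$$\frac{1}{p} \ge \frac{1}{H(p)}\log_2\frac{1}{p} \ge \frac{1}{H(p)}\log_2\left(\frac{1}{H(p)}\log_2\frac{1}{p}\right) \ge \frac{1}{H(p)}\log_2\frac{1}{H(p)}$$

because $1/p \ge 2$. For the lower bound, we apply $H(p) \le 2p\log_2(1/p)$ twice to obtain

$$\frac{1}{p} \le \frac{2}{H(p)}\log_2\frac{1}{p} \le \frac{2}{H(p)}\log_2\left(\frac{2}{H(p)}\log_2\frac{1}{p}\right).$$

Now $2/H(p) \ge (1/p)/\log_2(1/p) \ge \sqrt{\log_2(1/p)}$, which is true for every $p \in (0, 1]$.
Therefore

$$\frac{1}{p} \le \frac{2}{H(p)}\log_2\left(\frac{8}{H(p)^3}\right) = \frac{6}{H(p)}\log_2\left(\frac{2}{H(p)}\right).$$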
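
The two-sided bound of Lemma 11 is easy to test numerically. The following Python sketch evaluates both sides for a given $p$; the names `binary_entropy` and `lemma11_bounds` are assumptions introduced here, and at $p = 1/2$ the upper bound is treated as infinite because $\log_2(1/H(p)) = 0$.

```python
from math import log2

def binary_entropy(p):
    """H(p) = -p log2 p - (1 - p) log2 (1 - p), with H(0) = H(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def lemma11_bounds(p):
    """Return (lower, upper) with lower <= p <= upper for p in (0, 1/2]."""
    h = binary_entropy(p)
    lower = h / (6 * log2(2 / h))
    # At p = 1/2 we have H(p) = 1, so the denominator log2(1/H(p)) vanishes.
    upper = h / log2(1 / h) if h < 1.0 else float("inf")
    return lower, upper
```

For instance, `lemma11_bounds(0.01)` brackets 0.01 between roughly 0.003 and 0.022.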
C Proof of Claim 6



The maximum probability in $\tilde{X}$ is attained by those strings in $\{0,1\}^n$ that have
Hamming weight exactly $0.9pn$. Let $a$ be such a string. Then

$$\Pr[\tilde{X} = a] \le \Pr[X = a] + \Pr[U = a]$$
$$= p^{0.9pn}(1-p)^{n - 0.9pn} + 2^{-n}$$
$$\le 2^{-nH(p)}\cdot p^{-0.1pn} + 2^{-n}$$
$$= 2^{-nH(p)}\cdot 2^{0.1p\log_2(1/p)n} + 2^{-n}$$
$$\le 2^{-nH(p)}\cdot 2^{0.1nH(p)} + 2^{-n}$$
$$= 2^{-0.9nH(p)} + 2^{-n}$$
$$= 2^{-1.8m} + 2^{-n} \le 2^{-1.5m}.$$